Overview

Dataset statistics

Number of variables: 20
Number of observations: 89786
Missing cells: 90667
Missing cells (%): 5.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 13.7 MiB
Average record size in memory: 160.0 B
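
The derived figures in this block can be reproduced from the raw counts. The following is a minimal check, assuming the missing-cell percentage is taken over all cells (observations × variables) and that 1 MiB is 2**20 bytes; the variable names are illustrative only.

```python
# Figures copied from the dataset statistics above.
n_variables = 20
n_observations = 89786
missing_cells = 90667
record_size_bytes = 160.0

# Share of missing cells over the whole table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Total memory implied by the average record size, in MiB.
total_mib = record_size_bytes * n_observations / 2**20

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Total size: {total_mib:.1f} MiB")
```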

Variable types

NUM: 13
CAT: 7

Warnings

Loan ID has a high cardinality: 81999 distinct values
Customer ID has a high cardinality: 81999 distinct values
Credit Score has 19155 (21.3%) missing values
Annual Income has 19155 (21.3%) missing values
Years in current job has 3803 (4.2%) missing values
Months since last delinquent has 48338 (53.8%) missing values
Annual Income is highly skewed (γ1 = 50.0048753)
Maximum Open Credit is highly skewed (γ1 = 127.4336386)
Loan ID is uniformly distributed
Customer ID is uniformly distributed
df_index has unique values
Number of Credit Problems has 77484 (86.3%) zeros
Bankruptcies has 79880 (89.0%) zeros
Tax Liens has 88090 (98.1%) zeros
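
The skewness warnings refer to the coefficient γ1. As a rough illustration of what it measures, the sketch below computes the plain moment-based skewness m3 / m2^1.5. This is an assumption about the general definition only: the report's own estimator may apply a small-sample bias correction, and the toy data are made up for the example.

```python
def skewness(values):
    """Moment-based skewness m3 / m2**1.5, with no small-sample correction."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical toy samples, not taken from the dataset.
symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 1, 1, 100]  # a single extreme value pulls the tail right

print(skewness(symmetric))
print(skewness(right_tailed))
```

A single very large value is enough to produce a strongly positive coefficient, which is consistent with the extreme maxima reported for Annual Income and Maximum Open Credit.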

Reproduction

Analysis started: 2020-09-13 12:39:06.684498
Analysis finished: 2020-09-13 12:39:50.497004
Duration: 43.81 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml

Variables

df_index
Real number (ℝ≥0)

UNIQUE

Distinct: 89786
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 48101.05217
Minimum: 0
Maximum: 100000
Zeros: 1
Zeros (%): < 0.1%
Memory size: 701.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 4513.25
Q1: 22977.25
Median: 47107
Q3: 72834.75
95-th percentile: 94459.75
Maximum: 100000
Range: 100000
Interquartile range (IQR): 49857.5

Descriptive statistics

Standard deviation: 28825.11814
Coefficient of variation (CV): 0.5992616967
Kurtosis: -1.191293725
Mean: 48101.05217
Median Absolute Deviation (MAD): 24878.5
Skewness: 0.08045064019
Sum: 4318801070
Variance: 830887435.5
Monotonicity: Strictly increasing
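
Several of the figures above are derived from others in the same block. The following sketch recomputes them for df_index, assuming the usual definitions (IQR = Q3 − Q1, range = maximum − minimum, CV = standard deviation / mean, variance = standard deviation squared); small differences come from the rounding of the printed inputs.

```python
# Figures copied from the df_index statistics above.
q1, q3 = 22977.25, 72834.75
minimum, maximum = 0, 100000
mean, std = 48101.05217, 28825.11814

iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance from the standard deviation

print(iqr, value_range, round(cv, 6), round(variance, 1))
```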
Histogram with fixed size bins (bins=50)
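Common values

Value | Count | Frequency (%)
2047 | 1 | < 0.1%
84668 | 1 | < 0.1%
78515 | 1 | < 0.1%
68276 | 1 | < 0.1%
66229 | 1 | < 0.1%
72374 | 1 | < 0.1%
70327 | 1 | < 0.1%
90809 | 1 | < 0.1%
96954 | 1 | < 0.1%
94907 | 1 | < 0.1%
Other values (89776) | 89776 | > 99.9%

Minimum 5 values

Value | Count | Frequency (%)
0 | 1 | < 0.1%
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
100000 | 1 | < 0.1%
99999 | 1 | < 0.1%
99998 | 1 | < 0.1%
99997 | 1 | < 0.1%
99996 | 1 | < 0.1%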

Loan ID
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 81999
Distinct (%): 91.3%
Missing: 1
Missing (%): < 0.1%
Memory size: 701.5 KiB

Value | Count | Frequency (%)
d38a0226-7576-49a9-ad78-abca796889c0 | 2 | < 0.1%
b469909a-39a5-41c3-aa34-4a6f6440fe03 | 2 | < 0.1%
d0a72eb1-f516-48fb-9692-f60543f29751 | 2 | < 0.1%
dcc6b323-476b-435e-b950-ef7597687a55 | 2 | < 0.1%
e86bfd72-fe27-4a4d-8aad-d9abd61588af | 2 | < 0.1%
bb3c0ce9-e0f2-4313-810a-93cc21b37660 | 2 | < 0.1%
b91857fa-40b8-4561-ac27-ec742a1774ca | 2 | < 0.1%
2eb92fde-97cb-46a9-985a-95078ead36be | 2 | < 0.1%
3f5f619c-9a05-4640-b516-fc631f0a9f83 | 2 | < 0.1%
1def9b0c-562d-4666-8f6a-00e78db5f91b | 2 | < 0.1%
Other values (81989) | 89765 | > 99.9%

Frequencies of value counts

Unique

Unique: 74213
Unique (%): 82.7%
Histogram of lengths of the category

Length

Max length: 36
Median length: 36
Mean length: 35.99963246
Min length: 3

Customer ID
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 81999
Distinct (%): 91.3%
Missing: 1
Missing (%): < 0.1%
Memory size: 701.5 KiB

Value | Count | Frequency (%)
dcf8b5c7-9116-414a-afca-e06bb640aaf8 | 2 | < 0.1%
428f53d2-05ed-40ef-9351-1c9e2ac54f33 | 2 | < 0.1%
39a1cb79-7e19-408f-983e-fb9872a3e1cf | 2 | < 0.1%
fb10e551-f9b7-4e76-91db-24eb06c14c20 | 2 | < 0.1%
1c4f0e19-dc72-48ae-b1c9-f66136249085 | 2 | < 0.1%
dbdf1932-e8f6-43a5-8210-21b15dd66f03 | 2 | < 0.1%
4950784a-1c40-44e1-b23e-d76c58b35ab0 | 2 | < 0.1%
347e5ead-a63a-432d-b497-607dd2d288ac | 2 | < 0.1%
249f4f50-957f-491c-9f5e-c43e768f40b9 | 2 | < 0.1%
62cd3cff-328a-4674-9264-4012c9f11b8d | 2 | < 0.1%
Other values (81989) | 89765 | > 99.9%

Frequencies of value counts

Unique

Unique: 74213
Unique (%): 82.7%
Histogram of lengths of the category

Length

Max length: 36
Median length: 36
Mean length: 35.99963246
Min length: 3

Loan Status
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 1
Missing (%): < 0.1%
Memory size: 701.5 KiB

Value | Count | Frequency (%)
Fully Paid | 67146 | 74.8%
Charged Off | 22639 | 25.2%
(Missing) | 1 | < 0.1%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 11
Median length: 10
Mean length: 10.25206602
Min length: 3

Current Loan Amount
Real number (ℝ≥0)

Distinct: 22004
Distinct (%): 24.5%
Missing: 1
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 13060894.72
Minimum: 10802
Maximum: 99999999
Zeros: 0
Zeros (%): 0.0%
Memory size: 701.5 KiB

Quantile statistics

Minimum: 10802
5-th percentile: 77352
Q1: 180268
Median: 313874
Q3: 532378
95-th percentile: 99999999
Maximum: 99999999
Range: 99989197
Interquartile range (IQR): 352110

Descriptive statistics

Standard deviation: 33295559.59
Coefficient of variation (CV): 2.549255645
Kurtosis: 2.964870276
Mean: 13060894.72
Median Absolute Deviation (MAD): 149842
Skewness: 2.228130482
Sum: 1.172672432e+12
Variance: 1.108594288e+15
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
99999999 | 11484 | 12.8%
223652 | 24 | < 0.1%
223322 | 24 | < 0.1%
216194 | 24 | < 0.1%
223102 | 23 | < 0.1%
219648 | 22 | < 0.1%
108966 | 22 | < 0.1%
216546 | 22 | < 0.1%
177452 | 22 | < 0.1%
223432 | 22 | < 0.1%
Other values (21994) | 78096 | 87.0%

Minimum 5 values

Value | Count | Frequency (%)
10802 | 1 | < 0.1%
11242 | 1 | < 0.1%
15422 | 1 | < 0.1%
21098 | 1 | < 0.1%
21450 | 2 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
99999999 | 11484 | 12.8%
789250 | 3 | < 0.1%
789184 | 4 | < 0.1%
789096 | 13 | < 0.1%
789030 | 7 | < 0.1%

Term
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 1
Missing (%): < 0.1%
Memory size: 701.5 KiB

Value | Count | Frequency (%)
Short Term | 66023 | 73.5%
Long Term | 23762 | 26.5%
(Missing) | 1 | < 0.1%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 10
Median length: 10
Mean length: 9.735270532
Min length: 3

Credit Score
Real number (ℝ≥0)

MISSING

Distinct: 324
Distinct (%): 0.5%
Missing: 19155
Missing (%): 21.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 1130.830598
Minimum: 585
Maximum: 7510
Zeros: 0
Zeros (%): 0.0%
Memory size: 701.5 KiB

Quantile statistics

Minimum: 585
5-th percentile: 664
Q1: 708
Median: 729
Q3: 742
95-th percentile: 6920
Maximum: 7510
Range: 6925
Interquartile range (IQR): 34

Descriptive statistics

Standard deviation: 1571.037395
Coefficient of variation (CV): 1.389277402
Kurtosis: 10.72000817
Mean: 1130.830598
Median Absolute Deviation (MAD): 15
Skewness: 3.56072444
Sum: 79871696
Variance: 2468158.498
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
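Common values

Value | Count | Frequency (%)
747 | 1825 | 2.0%
740 | 1746 | 1.9%
746 | 1742 | 1.9%
741 | 1732 | 1.9%
742 | 1723 | 1.9%
739 | 1624 | 1.8%
745 | 1612 | 1.8%
748 | 1598 | 1.8%
743 | 1555 | 1.7%
738 | 1495 | 1.7%
Other values (314) | 53979 | 60.1%
(Missing) | 19155 | 21.3%

Minimum 5 values

Value | Count | Frequency (%)
585 | 9 | < 0.1%
586 | 6 | < 0.1%
587 | 9 | < 0.1%
588 | 14 | < 0.1%
589 | 5 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
7510 | 9 | < 0.1%
7500 | 24 | < 0.1%
7490 | 23 | < 0.1%
7480 | 43 | < 0.1%
7470 | 51 | 0.1%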

Annual Income
Real number (ℝ≥0)

MISSING
SKEWED

Distinct: 36174
Distinct (%): 51.2%
Missing: 19155
Missing (%): 21.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 1375894.388
Minimum: 76627
Maximum: 165557393
Zeros: 0
Zeros (%): 0.0%
Memory size: 701.5 KiB

Quantile statistics

Minimum: 76627
5-th percentile: 520011
Q1: 847932
Median: 1168975
Q3: 1648915
95-th percentile: 2805540
Maximum: 165557393
Range: 165480766
Interquartile range (IQR): 800983

Descriptive statistics

Standard deviation: 1104851.699
Coefficient of variation (CV): 0.8030061816
Kurtosis: 6955.693692
Mean: 1375894.388
Median Absolute Deviation (MAD): 380456
Skewness: 50.0048753
Sum: 9.71807965e+10
Variance: 1.220697276e+12
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
1162572 | 20 | < 0.1%
1140000 | 17 | < 0.1%
969475 | 16 | < 0.1%
1112640 | 15 | < 0.1%
1146612 | 15 | < 0.1%
973370 | 14 | < 0.1%
1143762 | 14 | < 0.1%
1320291 | 13 | < 0.1%
1251435 | 13 | < 0.1%
949905 | 13 | < 0.1%
Other values (36164) | 70481 | 78.5%
(Missing) | 19155 | 21.3%

Minimum 5 values

Value | Count | Frequency (%)
76627 | 1 | < 0.1%
81092 | 1 | < 0.1%
94867 | 1 | < 0.1%
97033 | 1 | < 0.1%
106533 | 1 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
165557393 | 1 | < 0.1%
36475440 | 1 | < 0.1%
30838995 | 1 | < 0.1%
28095300 | 1 | < 0.1%
24161540 | 1 | < 0.1%
 

Years in current job
Categorical

MISSING

Distinct: 11
Distinct (%): < 0.1%
Missing: 3803
Missing (%): 4.2%
Memory size: 701.5 KiB

Value | Count | Frequency (%)
10+ years | 27755 | 30.9%
2 years | 8254 | 9.2%
< 1 year | 7365 | 8.2%
3 years | 7339 | 8.2%
5 years | 6136 | 6.8%
1 year | 5832 | 6.5%
4 years | 5511 | 6.1%
6 years | 5134 | 5.7%
7 years | 4989 | 5.6%
8 years | 4121 | 4.6%
(Missing) | 3803 | 4.2%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 9
Median length: 7
Mean length: 7.465896688
Min length: 3

Home Ownership
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 1
Missing (%): < 0.1%
Memory size: 701.5 KiB

Value | Count | Frequency (%)
Home Mortgage | 43548 | 48.5%
Rent | 37855 | 42.2%
Own Home | 8199 | 9.1%
HaveMortgage | 183 | 0.2%
(Missing) | 1 | < 0.1%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 13
Median length: 8
Mean length: 8.746742254
Min length: 3

Purpose
Categorical

Distinct: 16
Distinct (%): < 0.1%
Missing: 1
Missing (%): < 0.1%
Memory size: 701.5 KiB

Value | Count | Frequency (%)
Debt Consolidation | 70834 | 78.9%
Home Improvements | 5237 | 5.8%
other | 5235 | 5.8%
Other | 2882 | 3.2%
Business Loan | 1366 | 1.5%
Buy a Car | 1165 | 1.3%
Medical Bills | 983 | 1.1%
Buy House | 582 | 0.6%
Take a Trip | 488 | 0.5%
major_purchase | 330 | 0.4%
Other values (6) | 683 | 0.8%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 20
Median length: 18
Mean length: 16.35724946
Min length: 3

Monthly Debt
Real number (ℝ≥0)

Distinct: 65765
Distinct (%): 73.2%
Missing: 1
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 18396.90641
Minimum: 0
Maximum: 435843.28
Zeros: 70
Zeros (%): 0.1%
Memory size: 701.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 3700.82
Q1: 10157.4
Median: 16140.31
Q3: 23918.91
95-th percentile: 40305.84
Maximum: 435843.28
Range: 435843.28
Interquartile range (IQR): 13761.51

Descriptive statistics

Standard deviation: 12145.28237
Coefficient of variation (CV): 0.6601806902
Kurtosis: 23.83645015
Mean: 18396.90641
Median Absolute Deviation (MAD): 6698.26
Skewness: 2.257836539
Sum: 1651766242
Variance: 147507883.9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
0 | 70 | 0.1%
15903 | 9 | < 0.1%
10647.98 | 8 | < 0.1%
13359.85 | 8 | < 0.1%
14726.52 | 7 | < 0.1%
11380.24 | 7 | < 0.1%
17132.11 | 6 | < 0.1%
22027.65 | 6 | < 0.1%
18926.09 | 6 | < 0.1%
24193.65 | 6 | < 0.1%
Other values (65755) | 89652 | 99.9%

Minimum 5 values

Value | Count | Frequency (%)
0 | 70 | 0.1%
7.41 | 1 | < 0.1%
12.92 | 1 | < 0.1%
17.1 | 1 | < 0.1%
19.57 | 1 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
435843.28 | 1 | < 0.1%
229057.92 | 1 | < 0.1%
205801.35 | 1 | < 0.1%
173265.56 | 1 | < 0.1%
172156.15 | 1 | < 0.1%

Years of Credit History
Real number (ℝ≥0)

Distinct: 506
Distinct (%): 0.6%
Missing: 1
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 18.24864176
Minimum: 3.6
Maximum: 70.5
Zeros: 0
Zeros (%): 0.0%
Memory size: 701.5 KiB

Quantile statistics

Minimum: 3.6
5-th percentile: 9
Q1: 13.5
Median: 17
Q3: 21.7
95-th percentile: 31.7
Maximum: 70.5
Range: 66.9
Interquartile range (IQR): 8.2

Descriptive statistics

Standard deviation: 7.034607126
Coefficient of variation (CV): 0.3854866143
Kurtosis: 1.762972872
Mean: 18.24864176
Median Absolute Deviation (MAD): 4
Skewness: 1.078179316
Sum: 1638454.3
Variance: 49.48569742
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
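Common values

Value | Count | Frequency (%)
16 | 1211 | 1.3%
15 | 1194 | 1.3%
17 | 1089 | 1.2%
16.5 | 1047 | 1.2%
14 | 1031 | 1.1%
15.4 | 959 | 1.1%
17.5 | 926 | 1.0%
13 | 924 | 1.0%
14.5 | 878 | 1.0%
18 | 867 | 1.0%
Other values (496) | 79659 | 88.7%

Minimum 5 values

Value | Count | Frequency (%)
3.6 | 1 | < 0.1%
3.7 | 2 | < 0.1%
3.8 | 3 | < 0.1%
3.9 | 3 | < 0.1%
4 | 6 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
70.5 | 1 | < 0.1%
65 | 2 | < 0.1%
60.5 | 2 | < 0.1%
59.9 | 1 | < 0.1%
59.7 | 1 | < 0.1%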

Months since last delinquent
Real number (ℝ≥0)

MISSING

Distinct: 116
Distinct (%): 0.3%
Missing: 48338
Missing (%): 53.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 34.97587338
Minimum: 0
Maximum: 176
Zeros: 197
Zeros (%): 0.2%
Memory size: 701.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 5
Q1: 16
Median: 32
Q3: 51
95-th percentile: 75
Maximum: 176
Range: 176
Interquartile range (IQR): 35

Descriptive statistics

Standard deviation: 22.00858534
Coefficient of variation (CV): 0.6292504866
Kurtosis: -0.7651843534
Mean: 34.97587338
Median Absolute Deviation (MAD): 17
Skewness: 0.4281797601
Sum: 1449680
Variance: 484.3778289
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
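Common values

Value | Count | Frequency (%)
13 | 816 | 0.9%
12 | 781 | 0.9%
14 | 765 | 0.9%
9 | 763 | 0.8%
15 | 756 | 0.8%
8 | 751 | 0.8%
16 | 746 | 0.8%
10 | 745 | 0.8%
7 | 743 | 0.8%
6 | 741 | 0.8%
Other values (106) | 33841 | 37.7%
(Missing) | 48338 | 53.8%

Minimum 5 values

Value | Count | Frequency (%)
0 | 197 | 0.2%
1 | 248 | 0.3%
2 | 361 | 0.4%
3 | 388 | 0.4%
4 | 460 | 0.5%

Maximum 5 values

Value | Count | Frequency (%)
176 | 1 | < 0.1%
152 | 1 | < 0.1%
148 | 1 | < 0.1%
143 | 1 | < 0.1%
141 | 1 | < 0.1%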

Number of Open Accounts
Real number (ℝ≥0)

Distinct: 51
Distinct (%): 0.1%
Missing: 1
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.12324999
Minimum: 0
Maximum: 76
Zeros: 2
Zeros (%): < 0.1%
Memory size: 701.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 5
Q1: 8
Median: 10
Q3: 14
95-th percentile: 20
Maximum: 76
Range: 76
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 4.998883993
Coefficient of variation (CV): 0.449408581
Kurtosis: 3.055089295
Mean: 11.12324999
Median Absolute Deviation (MAD): 3
Skewness: 1.17844177
Sum: 998701
Variance: 24.98884117
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
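Common values

Value | Count | Frequency (%)
9 | 8432 | 9.4%
10 | 8108 | 9.0%
8 | 7933 | 8.8%
11 | 7703 | 8.6%
7 | 7290 | 8.1%
12 | 6717 | 7.5%
6 | 6076 | 6.8%
13 | 5622 | 6.3%
14 | 4642 | 5.2%
5 | 4245 | 4.7%
Other values (41) | 23017 | 25.6%

Minimum 5 values

Value | Count | Frequency (%)
0 | 2 | < 0.1%
1 | 22 | < 0.1%
2 | 392 | 0.4%
3 | 1195 | 1.3%
4 | 2545 | 2.8%

Maximum 5 values

Value | Count | Frequency (%)
76 | 2 | < 0.1%
56 | 2 | < 0.1%
52 | 2 | < 0.1%
48 | 2 | < 0.1%
47 | 3 | < 0.1%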

Number of Credit Problems
Real number (ℝ≥0)

ZEROS

Distinct: 14
Distinct (%): < 0.1%
Missing: 1
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.164983015
Minimum: 0
Maximum: 15
Zeros: 77484
Zeros (%): 86.3%
Memory size: 701.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 1
Maximum: 15
Range: 15
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.4780104753
Coefficient of variation (CV): 2.897331434
Kurtosis: 49.09555763
Mean: 0.164983015
Median Absolute Deviation (MAD): 0
Skewness: 4.866898689
Sum: 14813
Variance: 0.2284940145
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=14)
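Common values

Value | Count | Frequency (%)
0 | 77484 | 86.3%
1 | 10647 | 11.9%
2 | 1136 | 1.3%
3 | 332 | 0.4%
4 | 111 | 0.1%
5 | 44 | < 0.1%
6 | 13 | < 0.1%
7 | 8 | < 0.1%
8 | 3 | < 0.1%
9 | 2 | < 0.1%
Other values (4) | 5 | < 0.1%

Minimum 5 values

Value | Count | Frequency (%)
0 | 77484 | 86.3%
1 | 10647 | 11.9%
2 | 1136 | 1.3%
3 | 332 | 0.4%
4 | 111 | 0.1%

Maximum 5 values

Value | Count | Frequency (%)
15 | 1 | < 0.1%
12 | 1 | < 0.1%
11 | 1 | < 0.1%
10 | 2 | < 0.1%
9 | 2 | < 0.1%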

Current Credit Balance
Real number (ℝ≥0)

Distinct: 32730
Distinct (%): 36.5%
Missing: 1
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 294035.132
Minimum: 0
Maximum: 32878968
Zeros: 526
Zeros (%): 0.6%
Memory size: 701.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 30210
Q1: 112936
Median: 209722
Q3: 367517
95-th percentile: 760220.4
Maximum: 32878968
Range: 32878968
Interquartile range (IQR): 254581

Descriptive statistics

Standard deviation: 372227.6983
Coefficient of variation (CV): 1.265929332
Kurtosis: 769.2294284
Mean: 294035.132
Median Absolute Deviation (MAD): 114798
Skewness: 14.54721549
Sum: 2.639994433e+10
Variance: 1.385534594e+11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
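Common values

Value | Count | Frequency (%)
0 | 526 | 0.6%
175978 | 15 | < 0.1%
137807 | 15 | < 0.1%
111682 | 14 | < 0.1%
67697 | 14 | < 0.1%
148846 | 14 | < 0.1%
106039 | 14 | < 0.1%
65683 | 13 | < 0.1%
124013 | 13 | < 0.1%
94221 | 13 | < 0.1%
Other values (32720) | 89134 | 99.3%

Minimum 5 values

Value | Count | Frequency (%)
0 | 526 | 0.6%
19 | 11 | < 0.1%
38 | 8 | < 0.1%
57 | 6 | < 0.1%
76 | 4 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
32878968 | 1 | < 0.1%
12986956 | 1 | < 0.1%
12746397 | 1 | < 0.1%
11796435 | 1 | < 0.1%
11361924 | 1 | < 0.1%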

Maximum Open Credit
Real number (ℝ≥0)

SKEWED

Distinct: 44596
Distinct (%): 49.7%
Missing: 3
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 775656.505
Minimum: 0
Maximum: 1539737892
Zeros: 628
Zeros (%): 0.7%
Memory size: 701.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 111078
Q1: 277068
Median: 472692
Q3: 791450
95-th percentile: 1660720.6
Maximum: 1539737892
Range: 1539737892
Interquartile range (IQR): 514382

Descriptive statistics

Standard deviation: 8803517.441
Coefficient of variation (CV): 11.34976292
Kurtosis: 18684.96184
Mean: 775656.505
Median Absolute Deviation (MAD): 233332
Skewness: 127.4336386
Sum: 6.964076799e+10
Variance: 7.750191934e+13
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
0 | 628 | 0.7%
150194 | 11 | < 0.1%
237204 | 11 | < 0.1%
384208 | 10 | < 0.1%
152812 | 10 | < 0.1%
236412 | 10 | < 0.1%
323928 | 10 | < 0.1%
219802 | 10 | < 0.1%
220968 | 10 | < 0.1%
644886 | 10 | < 0.1%
Other values (44586) | 89063 | 99.2%

Minimum 5 values

Value | Count | Frequency (%)
0 | 628 | 0.7%
4334 | 2 | < 0.1%
4444 | 1 | < 0.1%
5390 | 1 | < 0.1%
6446 | 5 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
1539737892 | 1 | < 0.1%
1304726170 | 1 | < 0.1%
980305260 | 1 | < 0.1%
798255370 | 1 | < 0.1%
632477736 | 1 | < 0.1%

Bankruptcies
Real number (ℝ≥0)

ZEROS

Distinct: 8
Distinct (%): < 0.1%
Missing: 191
Missing (%): 0.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.1155310006
Minimum: 0
Maximum: 7
Zeros: 79880
Zeros (%): 89.0%
Memory size: 701.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 1
Maximum: 7
Range: 7
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.3479189551
Coefficient of variation (CV): 3.011477034
Kurtosis: 18.13993235
Mean: 0.1155310006
Median Absolute Deviation (MAD): 0
Skewness: 3.50423627
Sum: 10351
Variance: 0.1210475993
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=8)
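Common values

Value | Count | Frequency (%)
0 | 79880 | 89.0%
1 | 9233 | 10.3%
2 | 369 | 0.4%
3 | 82 | 0.1%
4 | 24 | < 0.1%
5 | 5 | < 0.1%
7 | 1 | < 0.1%
6 | 1 | < 0.1%
(Missing) | 191 | 0.2%

Minimum 5 values

Value | Count | Frequency (%)
0 | 79880 | 89.0%
1 | 9233 | 10.3%
2 | 369 | 0.4%
3 | 82 | 0.1%
4 | 24 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
7 | 1 | < 0.1%
6 | 1 | < 0.1%
5 | 5 | < 0.1%
4 | 24 | < 0.1%
3 | 82 | 0.1%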

Tax Liens
Real number (ℝ≥0)

ZEROS

Distinct: 12
Distinct (%): < 0.1%
Missing: 10
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.02860452682
Minimum: 0
Maximum: 15
Zeros: 88090
Zeros (%): 98.1%
Memory size: 701.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0
Maximum: 15
Range: 15
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.2551075273
Coefficient of variation (CV): 8.918431996
Kurtosis: 409.2835831
Mean: 0.02860452682
Median Absolute Deviation (MAD): 0
Skewness: 15.61181423
Sum: 2568
Variance: 0.06507985046
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
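Common values

Value | Count | Frequency (%)
0 | 88090 | 98.1%
1 | 1174 | 1.3%
2 | 323 | 0.4%
3 | 100 | 0.1%
4 | 52 | 0.1%
5 | 15 | < 0.1%
6 | 10 | < 0.1%
7 | 6 | < 0.1%
9 | 3 | < 0.1%
15 | 1 | < 0.1%
Other values (2) | 2 | < 0.1%
(Missing) | 10 | < 0.1%

Minimum 5 values

Value | Count | Frequency (%)
0 | 88090 | 98.1%
1 | 1174 | 1.3%
2 | 323 | 0.4%
3 | 100 | 0.1%
4 | 52 | 0.1%

Maximum 5 values

Value | Count | Frequency (%)
15 | 1 | < 0.1%
11 | 1 | < 0.1%
10 | 1 | < 0.1%
9 | 3 | < 0.1%
7 | 6 | < 0.1%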

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
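
The calculation described above can be sketched in plain Python as follows. This is a minimal illustration, not the code the report itself runs; the function name and the toy data are assumptions made for the example.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)

# Hypothetical toy data.
x = [1, 2, 3, 4]
y = [2, 4, 6, 8]
print(pearson_r(x, y))                         # perfectly linear, increasing
print(pearson_r(x, [8, 6, 4, 2]))              # perfectly linear, decreasing
print(pearson_r([10 * a + 3 for a in x], y))   # unchanged by shifting and rescaling x
```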

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
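
The pair-counting rule above corresponds to the variant usually called tau-a. The sketch below implements exactly that rule; note that statistical libraries often default to a tie-adjusted variant (tau-b), so results on data with ties may differ from this illustration. The function name and toy data are assumptions.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
        # Pairs tied in x or y count as neither.
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

# Hypothetical toy data.
print(kendall_tau_a([1, 2, 3, 4], [1, 2, 3, 4]))
print(kendall_tau_a([1, 2, 3, 4], [4, 3, 2, 1]))
print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
```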

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013, which can be found here.
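
For illustration, a bias-corrected Cramér's V along the lines of Bergsma's proposal can be sketched as below. This is written from the published formula as generally stated, not taken from the report's source code; the function name, the guard for degenerate tables and the toy data are assumptions of this example.

```python
import math
from collections import Counter

def cramers_v_corrected(x, y):
    """Bias-corrected Cramér's V for two equally long sequences of labels."""
    n = len(x)
    pair_counts = Counter(zip(x, y))
    row_counts = Counter(x)
    col_counts = Counter(y)

    # Pearson chi-squared statistic of the contingency table.
    chi2 = 0.0
    for row_label, row_total in row_counts.items():
        for col_label, col_total in col_counts.items():
            expected = row_total * col_total / n
            observed = pair_counts.get((row_label, col_label), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    n_rows, n_cols = len(row_counts), len(col_counts)

    # Bias corrections for phi squared and for the table dimensions.
    phi2_corr = max(0.0, phi2 - (n_cols - 1) * (n_rows - 1) / (n - 1))
    rows_corr = n_rows - (n_rows - 1) ** 2 / (n - 1)
    cols_corr = n_cols - (n_cols - 1) ** 2 / (n - 1)

    denominator = min(cols_corr - 1, rows_corr - 1)
    if denominator <= 0:  # degenerate table, e.g. a constant variable
        return 0.0
    return math.sqrt(phi2_corr / denominator)

# Hypothetical toy data.
perfect_x = ["a", "b"] * 50
perfect_y = ["u", "v"] * 50
indep_x = ["a", "a", "b", "b"] * 25
indep_y = ["u", "v", "u", "v"] * 25
print(cramers_v_corrected(perfect_x, perfect_y))
print(cramers_v_corrected(indep_x, indep_y))
```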

Missing values

Sample

First rows

(index) | df_index | Loan ID | Customer ID | Loan Status | Current Loan Amount | Term | Credit Score | Annual Income | Years in current job | Home Ownership | Purpose | Monthly Debt | Years of Credit History | Months since last delinquent | Number of Open Accounts | Number of Credit Problems | Current Credit Balance | Maximum Open Credit | Bankruptcies | Tax Liens
0 | 0 | 14dd8831-6af5-400b-83ec-68e61888a048 | 981165ec-3274-42f5-a3b4-d104041a9ca9 | Fully Paid | 445412.0 | Short Term | 709.0 | 1167493.0 | 8 years | Home Mortgage | Home Improvements | 5214.74 | 17.2 | NaN | 6.0 | 1.0 | 228190.0 | 416746.0 | 1.0 | 0.0
1 | 1 | 4771cc26-131a-45db-b5aa-537ea4ba5342 | 2de017a3-2e01-49cb-a581-08169e83be29 | Fully Paid | 262328.0 | Short Term | NaN | NaN | 10+ years | Home Mortgage | Debt Consolidation | 33295.98 | 21.1 | 8.0 | 35.0 | 0.0 | 229976.0 | 850784.0 | 0.0 | 0.0
2 | 2 | 4eed4e6a-aa2f-4c91-8651-ce984ee8fb26 | 5efb2b2b-bf11-4dfd-a572-3761a2694725 | Fully Paid | 99999999.0 | Short Term | 741.0 | 2231892.0 | 8 years | Own Home | Debt Consolidation | 29200.53 | 14.9 | 29.0 | 18.0 | 1.0 | 297996.0 | 750090.0 | 0.0 | 0.0
3 | 3 | 77598f7b-32e7-4e3b-a6e5-06ba0d98fe8a | e777faab-98ae-45af-9a86-7ce5b33b1011 | Fully Paid | 347666.0 | Long Term | 721.0 | 806949.0 | 3 years | Own Home | Debt Consolidation | 8741.90 | 12.0 | NaN | 9.0 | 0.0 | 256329.0 | 386958.0 | 0.0 | 0.0
4 | 4 | d4062e70-befa-4995-8643-a0de73938182 | 81536ad9-5ccf-4eb8-befb-47a4d608658e | Fully Paid | 176220.0 | Short Term | NaN | NaN | 5 years | Rent | Debt Consolidation | 20639.70 | 6.1 | NaN | 15.0 | 0.0 | 253460.0 | 427174.0 | 0.0 | 0.0
5 | 5 | 89d8cb0c-e5c2-4f54-b056-48a645c543dd | 4ffe99d3-7f2a-44db-afc1-40943f1f9750 | Charged Off | 206602.0 | Short Term | 7290.0 | 896857.0 | 10+ years | Home Mortgage | Debt Consolidation | 16367.74 | 17.3 | NaN | 6.0 | 0.0 | 215308.0 | 272448.0 | 0.0 | 0.0
6 | 6 | 273581de-85d8-4332-81a5-19b04ce68666 | 90a75dde-34d5-419c-90dc-1e58b04b3e35 | Fully Paid | 217646.0 | Short Term | 730.0 | 1184194.0 | < 1 year | Home Mortgage | Debt Consolidation | 10855.08 | 19.6 | 10.0 | 13.0 | 1.0 | 122170.0 | 272052.0 | 1.0 | 0.0
7 | 7 | db0dc6e1-77ee-4826-acca-772f9039e1c7 | 018973c9-e316-4956-b363-67e134fb0931 | Charged Off | 648714.0 | Long Term | NaN | NaN | < 1 year | Home Mortgage | Buy House | 14806.13 | 8.2 | 8.0 | 15.0 | 0.0 | 193306.0 | 864204.0 | 0.0 | 0.0
8 | 8 | 8af915d9-9e91-44a0-b5a2-564a45c12089 | af534dea-d27e-4fd6-9de8-efaa52a78ec0 | Fully Paid | 548746.0 | Short Term | 678.0 | 2559110.0 | 2 years | Rent | Debt Consolidation | 18660.28 | 22.6 | 33.0 | 4.0 | 0.0 | 437171.0 | 555038.0 | 0.0 | 0.0
9 | 9 | 0b1c4e3d-bd97-45ce-9622-22732fcdc9a0 | 235c4a43-dadf-483d-aa44-9d6d77ae4583 | Fully Paid | 215952.0 | Short Term | 739.0 | 1454735.0 | < 1 year | Rent | Debt Consolidation | 39277.75 | 13.9 | NaN | 20.0 | 0.0 | 669560.0 | 1021460.0 | 0.0 | 0.0

Last rows

(index) | df_index | Loan ID | Customer ID | Loan Status | Current Loan Amount | Term | Credit Score | Annual Income | Years in current job | Home Ownership | Purpose | Monthly Debt | Years of Credit History | Months since last delinquent | Number of Open Accounts | Number of Credit Problems | Current Credit Balance | Maximum Open Credit | Bankruptcies | Tax Liens
89776 | 99988 | 125a4df1-c538-4b1e-b37f-b515a2ce370c | 82488d29-ef29-4f1c-a786-dee1b257dee5 | Charged Off | 309474.0 | Short Term | NaN | NaN | 10+ years | Home Mortgage | Debt Consolidation | 13817.18 | 26.9 | NaN | 15.0 | 0.0 | 225872.0 | 892606.0 | 0.0 | 0.0
89777 | 99989 | 2649e526-3866-4555-91a3-9bd5d26dcba2 | 9ff0267b-fa13-4f8e-b625-e8c36b6685e0 | Charged Off | 429132.0 | Short Term | NaN | NaN | 10+ years | Home Mortgage | Debt Consolidation | 28948.02 | 16.3 | NaN | 16.0 | 0.0 | 485279.0 | 656414.0 | 0.0 | 0.0
89778 | 99990 | 686017b3-dc24-4f8a-af92-0bd077452d3d | 1a583add-21ba-410f-9c42-757c4ed19322 | Fully Paid | 99999999.0 | Short Term | 742.0 | 1190046.0 | < 1 year | Rent | other | 11969.81 | 20.1 | 16.0 | 9.0 | 0.0 | 37392.0 | 134442.0 | 0.0 | 0.0
89779 | 99992 | c568adaa-16f9-43d3-b522-8532fb57cb16 | cbb29fd6-e418-4f09-a4bd-4de83428caab | Fully Paid | 48796.0 | Short Term | NaN | NaN | 4 years | Home Mortgage | major_purchase | 8298.63 | 8.3 | NaN | 9.0 | 0.0 | 87875.0 | 239404.0 | 0.0 | 0.0
89780 | 99994 | 8506a4e9-af7d-47d2-a1bf-7ea2c41858f0 | be67200e-1ef1-4b63-86a6-2bf27d3c704d | Fully Paid | 210584.0 | Short Term | 719.0 | 783389.0 | 1 year | Home Mortgage | Other | 3727.61 | 17.4 | 18.0 | 6.0 | 0.0 | 456.0 | 259160.0 | 0.0 | 0.0
89781 | 99996 | 06eba04f-58fc-424a-b666-ed72aa008900 | 77f2252a-b7d1-4b07-a746-1202a8304290 | Fully Paid | 99999999.0 | Short Term | 732.0 | 1289416.0 | 1 year | Rent | Debt Consolidation | 13109.05 | 9.4 | 21.0 | 22.0 | 0.0 | 153045.0 | 509234.0 | 0.0 | 0.0
89782 | 99997 | e1cb4050-eff5-4bdb-a1b0-aabd3f7eaac7 | 2ced5f10-bd60-4a11-9134-cadce4e7b0a3 | Fully Paid | 103136.0 | Short Term | 742.0 | 1150545.0 | 6 years | Rent | Debt Consolidation | 7315.57 | 18.8 | 18.0 | 12.0 | 1.0 | 109554.0 | 537548.0 | 1.0 | 0.0
89783 | 99998 | 81ab928b-d1a5-4523-9a3c-271ebb01b4fb | 3e45ffda-99fd-4cfc-b8b8-446f4a505f36 | Fully Paid | 530332.0 | Short Term | 746.0 | 1717524.0 | 9 years | Rent | Debt Consolidation | 9890.07 | 15.0 | NaN | 8.0 | 0.0 | 404225.0 | 738254.0 | 0.0 | 0.0
89784 | 99999 | c63916c6-6d46-47a9-949a-51d09af4414f | 1b3014be-5c07-4d41-abe7-44573c375886 | Fully Paid | 99999999.0 | Short Term | 743.0 | 935180.0 | NaN | Own Home | Debt Consolidation | 9118.10 | 13.0 | NaN | 4.0 | 1.0 | 45600.0 | 91014.0 | 1.0 | 0.0
89785 | 100000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN